
Conversation


@mattf commented Aug 16, 2025

Implements optional idempotency for batch creation using the `idem_tok` parameter:

* **Core idempotency**: Same token + parameters returns existing batch
* **Conflict detection**: Same token + different parameters raises HTTP 409 ConflictError
* **Metadata order independence**: Different key ordering doesn't affect idempotency

**API changes:**
- Add optional `idem_tok` parameter to `create_batch()` method
- Enhanced API documentation with idempotency extensions

**Implementation:**
- Reference provider supports idempotent batch creation
- ConflictError for proper HTTP 409 status code mapping
- Comprehensive parameter validation

**Testing:**
- Unit tests: focused tests covering core scenarios with parametrized conflict detection
- Integration tests: tests validating real OpenAI client behavior

This enables client-side retry safety and prevents duplicate batch creation
when using the same idempotency token, following common REST API conventions for idempotency.

closes #3144
meta-cla bot added the **CLA Signed** label Aug 16, 2025

graphite-app bot commented Aug 16, 2025

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • add-to-merge-queue - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.


mattf commented Aug 16, 2025

@franciscojavierarceo ptal

```python
    endpoint: str,
    completion_window: Literal["24h"],
    metadata: dict[str, str] | None = None,
    idem_tok: str | None = None,  # intentionally bad name
```
Contributor

what does this comment mean?

Collaborator

`idem_tok` is an intentionally bad name.

Collaborator @franciscojavierarceo commented Aug 19, 2025

fwiw, i would have just called it `idempotency_key`; was the choice here because that's too verbose?

Collaborator Author

exactly this. the choice was to get feedback.

it's a non-standard param. chances are other APIs will follow the lead here.

`idem_tok` -> `idempotency_key`?

Collaborator

i am partial to `idempotency_key` because I actually had to confirm that `idempotency_token` was equivalent to it (which it is, according to lots of google hits), so I imagine it's just convention exposure and key was exposed to me first. token can be an unfortunately loaded term in our ai land fwiw.

```python
# allowing us to detect parameter conflicts
if idem_tok is not None:
    hash_input = idem_tok.encode("utf-8")
    hash_digest = hashlib.sha256(hash_input).hexdigest()[:24]
```
Contributor

The default batch id uses a 16-char hex section; is there a reason to use a different length here?

Collaborator

+1

Collaborator Author

secret way to tell the difference. happy to align them.

Collaborator

oh that's fine then. i personally like to add a prefix for those reasons (OpenAI follows this, though they don't expose an idempotency key), but a different size is okay.

```python
# For idempotent requests, use the idempotency token for the batch ID
# This ensures the same token always maps to the same batch ID,
# allowing us to detect parameter conflicts
if idem_tok is not None:
```
Collaborator

I assumed we would generate the idempotency key on the server based on the concatenation of input metadata (similar to how we generate `chunk_id`) and then store that in the DB, rather than expose it in the API.

could you elaborate on the rationale for having it defined by the user of the client?

Collaborator Author

openai's /v1/batches allows for creating duplicate batches. if we don't expose something and de-dup server side, we'll break semantics.

another approach i considered was a user-passed `allow_duplicates`, which defaulted to true.

but the common approach for this functionality is a user-controlled key.

Contributor

idempotency is always a client-side concept. the client is telling the server the unique context under which the request was generated so that the server can dedupe things correctly.

Collaborator @franciscojavierarceo commented Aug 21, 2025

Yes you both are right of course. 🤦🏻

I forgot that this should be generated by the client.

https://medium.com/@sahintalha1/the-way-psps-such-as-paypal-stripe-and-adyen-prevent-duplicate-payment-idempotency-keys-615845c185bf

Collaborator @franciscojavierarceo left a comment

lgtm, some small nits

@mattf requested a review from yanxi0830 as a code owner August 19, 2025 20:34


```python
@pytest.fixture
async def provider():
```
Contributor

this is a rather non-specific name for a fixture which isn't actually that general. can you rename this `batches_provider`?

Contributor

I am merging this anyhow as I prepare for a release.

@ashwinb merged commit cffc4ed into llamastack:main Aug 22, 2025
22 checks passed
franciscojavierarceo pushed a commit to franciscojavierarceo/llama-stack that referenced this pull request Aug 25, 2025